SentiCap: Generating Image Descriptions with Sentiments
The recent progress on image recognition and language modeling is making
automatic description of image content a reality. However, stylized,
non-factual aspects of the written description are missing from the current
systems. One such style is descriptions with emotions, which is commonplace in
everyday communication, and influences decision-making and interpersonal
relationships. We design a system to describe an image with emotions, and
present a model that automatically generates captions with positive or negative
sentiments. We propose a novel switching recurrent neural network with
word-level regularization, which is able to produce emotional image captions
using only 2000+ training sentences containing sentiments. We evaluate the
captions with different automatic and crowd-sourcing metrics. Our model
compares favourably in common quality metrics for image captioning. In 84.6% of
cases the generated positive captions were judged as being at least as
descriptive as the factual captions. Of these positive captions 88% were
confirmed by the crowd-sourced workers as having the appropriate sentiment.
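As a rough illustration of the switching idea in this abstract, the sketch below mixes two next-word distributions, one from a descriptive language model and one from a sentiment language model, using a switch probability. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mix_word_distributions`, the toy vocabularies, and the fixed `switch_prob` are all hypothetical, and in a real model both distributions and the switch would be produced by recurrent networks at every time step.

```python
def mix_word_distributions(p_descriptive, p_sentiment, switch_prob):
    """Blend two next-word distributions with a switch probability.

    p_descriptive, p_sentiment: dicts mapping word -> probability (assumed inputs).
    switch_prob: assumed probability of drawing the word from the sentiment stream.
    """
    if not 0.0 <= switch_prob <= 1.0:
        raise ValueError("switch_prob must lie in [0, 1]")
    vocabulary = set(p_descriptive) | set(p_sentiment)
    return {
        word: (1.0 - switch_prob) * p_descriptive.get(word, 0.0)
        + switch_prob * p_sentiment.get(word, 0.0)
        for word in vocabulary
    }


# Toy distributions, invented for illustration only.
descriptive = {"dog": 0.7, "park": 0.3}
sentiment = {"lovely": 0.6, "dog": 0.4}
mixed = mix_word_distributions(descriptive, sentiment, switch_prob=0.25)
```

Because both inputs are probability distributions and the weights sum to one, the mixture is again a distribution over the joint vocabulary.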
Automatic Image Captioning with Style
This thesis connects two core topics in machine learning, vision
and language. The problem of choice is image caption generation:
automatically constructing natural language descriptions of image
content. Previous research into image caption generation has
focused on generating purely descriptive captions; I focus on
generating visually relevant captions with a distinct linguistic
style. Captions with style have the potential to ease
communication and add a new layer of personalisation.
First, I consider naming variations in image captions, and
propose a method for predicting context-dependent names that
takes into account visual and linguistic information. This method
makes use of a large-scale image caption dataset, which I also
use to explore naming conventions and report naming conventions
for hundreds of animal classes. Next I propose the SentiCap
model, which relies on recent advances in artificial neural
networks to generate visually relevant image captions with
positive or negative sentiment. To balance descriptiveness and
sentiment, the SentiCap model dynamically switches between two
recurrent neural networks, one tuned for descriptive words and
one for sentiment words. As the first published model for
generating captions with sentiment, SentiCap has influenced a
number of subsequent works. I then investigate the sub-task of
modelling styled sentences without images. The specific task
chosen is sentence simplification: rewriting news article
sentences to make them easier to understand.
For this task I design a neural sequence-to-sequence model that
can work with limited training data, using novel adaptations for
word copying and sharing word embeddings. Finally, I present
SemStyle, a system for generating visually relevant image
captions in the style of an arbitrary text corpus. A shared term
space allows a neural network for vision and content planning to
communicate with a network for styled language generation.
SemStyle achieves competitive results in human and automatic
evaluations of descriptiveness and style.
As a whole, this thesis presents two complete systems for styled
caption generation that are first of their kind and demonstrate,
for the first time, that automatic style transfer for image
captions is achievable. Contributions also include novel ideas
for object naming and sentence simplification. This thesis opens
up inquiries into highly personalised image captions; large scale
visually grounded concept naming; and more generally, styled text
generation with content control.
UNIPoint: Universally Approximating Point Processes Intensities
Point processes are a useful mathematical tool for describing events over
time, and so there are many recent approaches for representing and learning
them. One notable open question is how to precisely describe the flexibility of
point process models and whether there exists a general model that can
represent all point processes. Our work bridges this gap. Focusing on the
widely used event intensity function representation of point processes, we
provide a proof that a class of learnable functions can universally approximate
any valid intensity function. The proof connects the well-known
Stone-Weierstrass Theorem for function approximation, the uniform density of
non-negative continuous functions using a transfer function, the formulation
of the parameters of a piecewise continuous function as a dynamic system, and
a recurrent neural network implementation for capturing the dynamics. Using
these insights, we design and implement UNIPoint, a novel neural point process
model, using recurrent neural networks to parameterise sums of basis functions
upon each event. Evaluations on synthetic and real world datasets show that
this simpler representation performs better than Hawkes process variants and
more complex neural network-based approaches. We expect this result will
provide a practical basis for selecting and tuning models, as well as
furthering theoretical work on representational complexity and learnability.
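The construction described in this abstract, a sum of basis functions passed through a transfer function so that the intensity stays non-negative, can be sketched as follows. This is a minimal sketch under stated assumptions rather than the UNIPoint implementation: the exponential basis, the softplus transfer function, and the hard-coded `(alpha, beta)` pairs are illustrative choices, and in the model described above a recurrent neural network would produce the parameters after each event.

```python
import math


def softplus(x):
    """Numerically stable softplus, log(1 + exp(x)); assumed transfer function."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def intensity(tau, params):
    """Intensity at time `tau` since the last event.

    params: list of (alpha, beta) pairs, one per basis function (assumed to be
    supplied by some upstream model; here they are plain numbers).
    Each basis function is alpha * exp(-beta * tau), an illustrative choice.
    """
    total = sum(alpha * math.exp(-beta * tau) for alpha, beta in params)
    return softplus(total)


# The raw sum may be negative; the transfer function keeps the intensity positive.
example = intensity(0.5, [(-5.0, 0.1)])
```

The transfer function is what allows unconstrained parameters: the basis sum can take any real value, while the returned intensity is always strictly positive.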
Dynamics Inside the Radio and X-ray Cluster Cavities of Cygnus A and Similar FRII Sources
We describe approximate axisymmetric computations of the dynamical evolution
of material inside radio lobes and X-ray cluster gas cavities in Fanaroff-Riley
II sources such as Cygnus A. All energy is delivered by a jet to the
lobe/cavity via a moving hotspot where jet energy dissipates in a reverse
shock. Our calculations describe the evolution of hot plasma, cosmic rays (CRs)
and toroidal magnetic fields flowing from the hotspot into the cavity. Many
observed features are explained. Gas, CRs and field flow back along the cavity
surface in a "boundary backflow" consistent with detailed FRII observations.
Computed ages of backflowing CRs are consistent with observed radio-synchrotron
age variations only if shear instabilities in the boundary backflow are damped
and we assume this is done with viscosity of unknown origin. Magnetic fields
estimated from synchrotron self-Compton (SSC) X-radiation observed near the
hotspot evolve into radio lobe fields. Computed profiles of radio synchrotron
lobe emission perpendicular to the jet are dramatically limb-brightened in
excellent agreement with FRII observations although computed lobe fields exceed
those observed. Strong winds flowing from hotspots naturally create kpc-sized
spatial offsets between hotspot inverse Compton (IC-CMB) X-ray emission and
radio synchrotron emission that peaks 1-2 kpc ahead where the field increases
due to wind compression. In our computed version of Cygnus A, nonthermal X-ray
emission increases from the hotspot (some IC-CMB, mostly SSC) toward the offset
radio synchrotron peak (mostly SSC). A faint thermal jet along the symmetry
axis may be responsible for redirecting the Cygnus A non-thermal jet.
Comment: 24 pages, 10 figures, accepted by Ap
Packing 3-vertex paths in claw-free graphs and related topics
An L-factor of a graph G is a spanning subgraph of G whose every component is
a 3-vertex path. Let v(G) be the number of vertices of G and d(G) the
domination number of G. A claw is a graph with four vertices and three edges
incident to the same vertex. A graph is claw-free if it has no induced subgraph
isomorphic to a claw. Our results include the following. Let G be a 3-connected
claw-free graph, x a vertex in G, e = xy an edge in G, and P a 3-vertex path in
G. Then
(a1) if v(G) = 0 mod 3, then G has an L-factor containing (avoiding) e,
(a2) if v(G) = 1 mod 3, then G - x has an L-factor,
(a3) if v(G) = 2 mod 3, then G - {x,y} has an L-factor,
(a4) if v(G) = 0 mod 3 and G is either cubic or 4-connected, then G - P has an L-factor,
(a5) if G is cubic with v(G) > 5 and E is a set of three edges in G, then G - E has an L-factor if and only if the subgraph induced by E in G is not a claw and not a triangle,
(a6) if v(G) = 1 mod 3, then G - {v,e} has an L-factor for every vertex v and every edge e in G,
(a7) if v(G) = 1 mod 3, then there exist a 4-vertex path N and a claw Y in G such that G - N and G - Y have L-factors, and
(a8) d(G) < v(G)/3 + 1 and if in addition G is not a cycle and v(G) = 1 mod 3, then d(G) < v(G)/3.
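The definitions above can be checked on small graphs by brute force. The sketch below is illustrative only and is not taken from the paper: the function names `is_claw_free` and `find_l_factor`, the adjacency-dictionary representation, and the triangular prism example are all assumptions. The prism is a 3-connected, cubic, claw-free graph on six vertices, so statement (a1) says it should have an L-factor.

```python
from itertools import combinations


def is_claw_free(adj):
    """Return True if no vertex has three pairwise non-adjacent neighbours."""
    for v, nbrs in adj.items():
        for a, b, c in combinations(nbrs, 3):
            if b not in adj[a] and c not in adj[a] and c not in adj[b]:
                return False
    return True


def find_l_factor(adj):
    """Backtracking search for vertex-disjoint 3-vertex paths covering all vertices.

    adj: dict mapping each vertex to the set of its neighbours (simple graph).
    Returns a list of (end, middle, end) triples, or None if no L-factor exists.
    Exponential time; intended for small examples only.
    """
    def paths_through(u, free):
        nbrs = [w for w in adj[u] if w in free]
        for a, b in combinations(nbrs, 2):   # u is the middle vertex
            yield (a, u, b)
        for m in nbrs:                        # u is an end vertex
            for e in adj[m]:
                if e in free and e != u:
                    yield (u, m, e)

    def solve(free):
        if not free:
            return []
        u = min(free)  # always cover the smallest uncovered vertex next
        for path in paths_through(u, free - {u}):
            rest = solve(free - set(path))
            if rest is not None:
                return [path] + rest
        return None

    if len(adj) % 3 != 0:
        return None
    return solve(frozenset(adj))


# Triangular prism: triangles 0-1-2 and 3-4-5 joined by edges 0-3, 1-4, 2-5.
prism = {0: {1, 2, 3}, 1: {0, 2, 4}, 2: {0, 1, 5},
         3: {0, 4, 5}, 4: {1, 3, 5}, 5: {2, 3, 4}}
factor = find_l_factor(prism)

# The claw itself: centre 0 with three pairwise non-adjacent leaves.
claw = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

Covering the smallest uncovered vertex first means each L-factor is explored through a single ordering of its paths, which keeps the search small on examples of this size.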
We explore the relations between packing problems of a graph and its line
graph to obtain some results on different types of packings. We also discuss
relations between L-packing and domination problems as well as between induced
L-packings and the Hadwiger conjecture.
Keywords: claw-free graph, cubic graph, vertex disjoint packing, edge
disjoint packing, 3-vertex factor, 3-vertex packing, path-factor, induced
packing, graph domination, graph minor, the Hadwiger conjecture.
Comment: 29 page